Learning Verb Argument Structure from Minimally Annotated Corpora

Authors

  • Anoop Sarkar
  • Woottiporn Tripasai
Abstract

In this paper we investigate the task of automatically identifying the correct argument structure for a set of verbs. We exploit the distributions of selected features from the local context of a verb. These distributions were extracted from a 23M-word WSJ corpus based on part-of-speech tags and phrasal chunks alone. This annotation is minimal compared with previous work on this task, which used full parse trees. We construct a decision tree classifier that achieves an error rate of 33.4%. Our result compares very favorably with previous work despite using considerably less data and requiring only minimal annotation of the data.

Similar resources

A large scale annotated child language construction database

Large scale annotated corpora of child language can be of great value in assessing theoretical proposals regarding language acquisition models. For example, they can help determine whether the type and amount of data required by a proposed language acquisition model can actually be found in a naturalistic data sample. To this end, several recent efforts have augmented the CHILDES child language...

Automatic Verb Classification Based on Statistical Distributions of Argument Structure

Automatic acquisition of lexical knowledge is critical to a wide range of natural language processing tasks. Especially important is knowledge about verbs, which are the primary source of relational information in a sentence--the predicate-argument structure that relates an action or state to its participants (i.e., who did what to whom). In this work, we report on supervised learning experimen...

Identifying Verb Arguments and their Syntactic Function in the Penn Treebank

In this paper, we present a tool that allows one to automatically extract verb argument-structure from the Penn Treebank as well as from other corpora annotated with the Penn Treebank release 2 conventions. More specifically, we examine each possible sequence of tags, both functional and categorial and determine whether such a sequence indicates an obligatory argument, an optional argument or a...

Annotation of Predicate-argument Structure on Molecular Biology Text

Annotated corpora are essential resources for natural language processing. This paper describes our approach for building a corpus annotated with predicate-argument structure on research abstracts in the molecular biology domain. Observation of the records in a database of cell signaling events and corresponding research abstracts showed that extracting predicate-argument structure is a useful interm...

Publication date: 2002